Network Benchmarking 



National Semiconductor 
Application Note 880 
John von Voros 
Bonnie Wilson 
February 1993






INTRODUCTION 

HOW DOES PERFORMS WORK? 

WHAT DELAYS AFFECT THROUGHPUT? 

DETERMINING THE PROPER ENVIRONMENT 

DETERMINING CLIENT SUITABILITY 

HOW DO I CHOOSE A CLIENT SOLUTION? 

SERVER SUITABILITY 

HOW DO I CHOOSE A SERVER SOLUTION? 

CONCLUSION 

INTRODUCTION 

As the competition for Ethernet® sockets and board-level
sales has increased, performance has become a means of
differentiating a product from the "pack." Most marketers
quote figures given by a benchmark program known as
PERFORMS, a throughput test provided by Novell® in its
driver development kit. The intent of this program is to
evaluate the stability of a driver and its associated
hardware, not necessarily the throughput. Unfortunately,
for lack of any other metric, it has become the de facto
standard for the evaluation of performance.

The results provided by PERFORMS depend on the adapter
cards, their associated software drivers, the performance
of the PCs used (e.g., 80486 machines vs. 80286 machines),
the number of workstations, and even such things as the
length of the interconnecting cable. It is therefore
relatively easy to skew this data towards whatever result
is desired. For example, if you wanted to prove that your
client solution was better than another, you might choose
to use a fast PC for the client and a slow PC for the
server. To be useful, data from this test must be used for
relative comparison between cards in the same "fair"
environment.

HOW DOES PERFORMS WORK? 

When executing PERFORMS, the user must specify the range of
file sizes to be transferred, the step size (e.g., 1K to
10K file sizes in 1K increments), and the amount of time
that each file size will be tested (e.g., 30 seconds).
PERFORMS creates a file of the specified length and places
it in the cache memory of the server. This is done to
eliminate the delay caused by the hard disk drive. (Hard
drives are almost always the rate-limiting factor in a
server-to-workstation transfer.) During the test interval,
each workstation simultaneously requests the cached file. A
file transfer is accomplished by requesting multiple reads
of a given packet length until the entire file is
transferred (shown below). The overhead byte counts listed
were taken from a protocol analyzer evaluation of a
PERFORMS file transfer.

1. Workstation submits a "read file data request" for a file
   (57 Bytes).

2. Server sends "read file data reply" which includes the
   requested data + 54 Bytes of overhead.

3. Workstation submits another "read file data request" for
   the next packet.

4. Server sends "read file data reply" . . .

This test reports a maximum and an average throughput for
the specified parameters. Reductions in total bandwidth are
caused by software overhead, collisions, the preamble field,
and the interframe gap. The maximum attainable throughput
for a 1024 byte file can be calculated as follows:



Quantity  Description                            Time for Transfer (µs)
--------  -------------------------------------  ----------------------
1         Read File Data Request (57 Bytes)      45.6
1         Read File Data Reply (1K Data +        862.4
          54 Bytes of Novell/Ethernet Overhead)
2         Preamble Fields (8 Bytes)              2 x 6.4 = 12.8
2         Interframe Gaps                        2 x 9.6 = 19.2
          Total                                  940
          Maximum throughput                     1090 Kbytes/sec

Note: Time for Transfer = Number of Bytes / (1.25 Mbytes/sec).
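
The arithmetic behind the table can be reproduced with a short script. This is only a sketch of the ideal case described above (one request and one reply per read, no collisions, no software delay). The function and constant names are illustrative, and treating 1 Kbyte as 1000 bytes is an assumption made here because it matches the 1090 Kbytes/sec figure.

```python
# Ideal-case PERFORMS read cycle on 10 Mbit/s Ethernet, following the table.
LINE_RATE_BYTES_PER_US = 1.25  # 1.25 Mbytes/sec = 1.25 bytes per microsecond
REQUEST_BYTES = 57             # "read file data request" size, from the note
REPLY_OVERHEAD_BYTES = 54      # Novell/Ethernet overhead on each reply
PREAMBLE_BYTES = 8             # preamble sent ahead of each frame
INTERFRAME_GAP_US = 9.6        # idle time after each frame
FRAMES_PER_CYCLE = 2           # one request plus one reply


def cycle_time_us(data_bytes):
    """Microseconds for one request/reply pair carrying data_bytes of file data."""
    request = REQUEST_BYTES / LINE_RATE_BYTES_PER_US
    reply = (data_bytes + REPLY_OVERHEAD_BYTES) / LINE_RATE_BYTES_PER_US
    preambles = FRAMES_PER_CYCLE * PREAMBLE_BYTES / LINE_RATE_BYTES_PER_US
    gaps = FRAMES_PER_CYCLE * INTERFRAME_GAP_US
    return request + reply + preambles + gaps


def ideal_throughput_kbytes_per_sec(data_bytes):
    """Payload throughput, assuming 1 Kbyte = 1000 bytes."""
    bytes_per_us = data_bytes / cycle_time_us(data_bytes)
    return bytes_per_us * 1000.0  # bytes per microsecond -> Kbytes/sec


if __name__ == "__main__":
    print(f"cycle time: {cycle_time_us(1024):.1f} us")
    print(f"throughput: {ideal_throughput_kbytes_per_sec(1024):.0f} Kbytes/sec")
```

For a 1024 byte read this gives a 940 µs cycle and roughly 1089 Kbytes/sec, which the table rounds to 1090.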



AT/LANTIC™ is a trademark of National Semiconductor Corporation.
Ethernet® is a registered trademark of Xerox Corporation.
Novell® is a registered trademark of Novell, Inc.
3Com® is a registered trademark of 3Com Corporation.

WHAT DELAYS AFFECT THROUGHPUT?

A throughput of 1090 Kbytes/sec represents an ideal network
in which there are no collisions or software delays. A more
accurate model would take into account all of the relevant
delays, as shown in Figure 1. D0 is the delay caused by the
redirector (a TSR program) running on the client, which is
required to intercept BIOS calls that are made for network
services. D1 and D5 represent the software overhead of the
client drivers, which provide the software interfaces to the
Ethernet cards. In general, the delay introduced is
proportional to the size of the driver in memory. Delays D2
and D4 are directly related to the efficiency and throughput
of the network hardware. D3 is the cable delay, which is a
function of the cable length and type. The network operating
system introduces a delay in order to accomplish all of its
tasks. Finally, D7 is the delay associated with the server
disk drive, which tends to dwarf the sum of the other delays.

In the event of a collision, the two sending workstations
that caused it will each wait a random amount of time
(determined by the random back-off algorithm specified in
IEEE 802.3) before retransmitting. This introduces a delay
equal to the time lost while transmitting the collided
packet plus the wait before retransmission. Of these delays,
the manufacturer of an Ethernet solution can control only
those related to the hardware interface, which is a function
of the Ethernet controller chosen, and its software drivers.
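
To give a feel for the size of the collision penalty, the random wait can be sketched as follows. The constants (51.2 µs slot time, exponent capped at 10, 16 transmission attempts) are the IEEE 802.3 values for 10 Mbit/s Ethernet and do not come from this note. The function names are illustrative only.

```python
import random

SLOT_TIME_US = 51.2   # 512 bit times at 10 Mbit/s (IEEE 802.3 value)
BACKOFF_LIMIT = 10    # the random range stops doubling after 10 collisions
ATTEMPT_LIMIT = 16    # the frame is abandoned when the 16th attempt collides


def max_backoff_us(collision_count):
    """Longest possible wait after the given number of successive collisions."""
    if not 1 <= collision_count < ATTEMPT_LIMIT:
        raise ValueError("collision_count must be between 1 and 15")
    exponent = min(collision_count, BACKOFF_LIMIT)
    return (2 ** exponent - 1) * SLOT_TIME_US


def backoff_delay_us(collision_count, rng=random):
    """Random wait: an integer number of slot times in 0 .. 2^k - 1."""
    if not 1 <= collision_count < ATTEMPT_LIMIT:
        raise ValueError("collision_count must be between 1 and 15")
    exponent = min(collision_count, BACKOFF_LIMIT)
    slots = rng.randrange(2 ** exponent)
    return slots * SLOT_TIME_US


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 15):
        print(n, backoff_delay_us(n), max_backoff_us(n))
```

After a first collision a station waits either 0 or 51.2 µs. The wait is added to the time already spent on the collided frame, so repeated collisions on a loaded segment quickly erode the ideal figure.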



DETERMINING THE PROPER ENVIRONMENT 

The proper testing environment should use the same types 
of machines that will be used in the typical end network. In 
general, most vendors will not use more than 5-10 PCs in 
their test due to resource/time constraints. This number of 
workstations does, however, provide enough network traffic 
to represent a loaded network. For the purpose of this pa- 
per, several series of benchmarks will be presented with the 
environment listed in Figure 2. The PERFORMS parameters 
specified are: 

Test Time: 30 Seconds
File Sizes: 1K to 10K Bytes
Increment: 1K Bytes



Server

Compaq, Model CP3301, 486DX-50, EISA-Bus, 8MB DRAM
SONIC-EISA PLX 32-Bit Busmaster Ethernet Card
Novell Speed Rating: 1327

Workstations

1. AST Premium, Model 5, 486DX-33, EISA-Bus
2. PC Brand, Model A84310, 486DLC-33 (Cyrix)
3. PC Brand, Model A84210, 486DX2/50, ISA-Bus
4. Clone, Western Digital Motherboard Model WDAP4200
   (Piranha 4200), ISA-Bus
5. Dell Model 433DE, 486DX-33, EISA-Bus

FIGURE 2. Test Environment

FIGURE 1. Network Delays

DETERMINING CLIENT SUITABILITY

One of the favorite tests proposed by Ethernet vendors uti- 
lizes a high speed server with only one client. It is important 
for the validity of this test that the server is not the bottle- 
neck in the system. For this benchmark, each client card 
was tested in the AST machine. It should be noted that 
EISA machines tend to penalize I/O mapped designs be- 
cause I/O cycles are slower than in ISA machines. 



These results indicate that the 3Com® adapter was the
fastest one available, which might be true if you could find
a network with only one client. A more realistic test would
include some kind of network loading. In order to determine
how these boards would perform in a more typical network,
each vendor's card was tested in turn with a copy installed
in each of the five client machines. As shown in Figure 4,
these adapters all perform within 2.5% of each other when
evaluated in a reasonable environment (the raw data is
provided at the end of this note). This delta is well within
the margin of error, so throughput differences among vendors
are slight at best.

FIGURE 3. Single Workstation, Single Server Throughput Results



Five Workstations, All Same Client Cards 

FIGURE 4. Five Workstations, Single Server Throughput Results
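
The 2.5% spread quoted for the five-workstation test can be checked against the raw data at the end of this note. The sketch below uses the seven five-workstation averages (996.66 to 1021.69 Kbytes/sec). Measuring the spread relative to the fastest card is an assumption made here, since the note does not say which baseline it used. The card names are written as they are read from the DATA tables.

```python
# Five-workstation average throughput in Kbytes/sec, from the DATA section.
FIVE_WORKSTATION_KBYTES_PER_SEC = {
    "AT1500": 1021.69,
    "3Com ETHERLINK III": 1015.16,
    "AT/LANTIC (SHARED)": 1005.78,
    "SMC ELITE": 1003.27,
    "XPRESS INTEL": 999.31,
    "AT/LANTIC (I/O)": 998.25,
    "NE2000": 996.66,
}


def spread_percent(results):
    """Gap between fastest and slowest card as a percentage of the fastest."""
    fastest = max(results.values())
    slowest = min(results.values())
    return (fastest - slowest) / fastest * 100.0


if __name__ == "__main__":
    print(f"spread: {spread_percent(FIVE_WORKSTATION_KBYTES_PER_SEC):.2f}%")
```

With these figures the gap between the fastest and slowest card is about 2.45% of the fastest card's throughput, consistent with the 2.5% quoted above.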



HOW DO I CHOOSE A CLIENT SOLUTION? 

Given that all cards perform about the same in a workstation
on a loaded network, how does one choose between one
supplier and another? This decision is usually based on
available software and compatibility with existing hardware.
As far as software drivers go, you would like to choose the
architecture that shows up as an install option on the menu
of the most software packages. That way, there is an
extensive installed base of software, and driver support is
provided by the software supplier. Hardware compatibility
means that the adapter card or motherboard solution will
migrate between different platforms without extra effort.
Bus-mastering, while ideal for servers (see next section),
tends to have compatibility problems with ISA-based
machines. The reason for this is that ISA machines were not
designed to accommodate bus-mastering devices. Each core
logic chip set has different timing parameters which affect
the operation of the bus. The I/O and shared memory adapters
tend to have the fewest problems because this is the
standard method of interfacing to slave cards on the ISA
bus.



SERVER SUITABILITY 

Server applications require a different assessment of per- 
formance than clients because the server must react to 
nearly every packet that appears on the network. Clients are 
only responsible for their small percentage of the overall 
network load. If a client has low throughput, only the client is 
affected. A slow server, on the other hand, will lead to a 
slow network. Another metric to be considered is CPU utili- 
zation since a server with no leftover CPU bandwidth may 
drop packets or be unable to run multiple modules of soft- 
ware. 

In order to determine the suitability of different cards for a 
server, NE2000 cards were placed in the five client ma- 
chines to represent a constant load. The CPU figures quot- 
ed in Figure 5 represent the CPU utilization given by No- 
vell's MONITOR program running on the server. Although 
this is a common practice, that figure represents more than 
just the bandwidth requirements for the transfer of packets. 
32-bit server adapters tend to have lower CPU utilization
than 16-bit cards due to double word transfers.



Server Throughput Test 

FIGURE 5. Server Throughput and CPU Utilization



The optimum solution for a server card depends on the type
of network that will be used. If a network will only consist
of five clients and needs little room to expand, the NE2000
card would be the choice because of its low cost and
hardware stability. At the other extreme, if a user needs to
put more than a few Ethernet cards in the server to create
multiple segments, the NE3200 would be the choice because it
has the lowest overall CPU utilization. This card would
excel in heavily loaded servers because of an embedded
microcontroller that runs the protocol and thus unburdens
the host CPU. A useful measure for the average server
application would combine performance and CPU utilization
into one figure of merit. For the purpose of this
evaluation, Figure 6 illustrates the various cards as ranked
by THROUGHPUT/CPU UTILIZATION for the data shown in
Figure 5.
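
The figure of merit is simply average throughput divided by CPU utilization. A minimal sketch of the ranking, using the server data listed at the end of this note, might look like the following. The function names are illustrative, and rounding to two decimals is chosen here to match the published column.

```python
# (average throughput in Kbytes/sec, CPU utilization in %) per server card,
# as listed in the DATA section for Figures 5 and 6.
SERVER_RESULTS = {
    "PLX-SONIC": (992.76, 11.5),
    "NE3200": (916.75, 11.0),
    "MYLEX (EISA)": (981.49, 16.0),
    "SONIC-AT": (986.15, 17.0),
    "SMC ELITE": (953.49, 35.5),
    "3Com ETHERLINK III": (960.81, 37.0),
    "AT/LANTIC (SHARED)": (966.12, 38.6),
    "HP ETHERTWIST": (946.59, 42.5),
    "XPRESS INTEL": (932.33, 50.0),
    "NE2000": (930.91, 70.0),
    "AT/LANTIC (I/O)": (907.68, 71.6),
}


def figure_of_merit(throughput, cpu_percent):
    """Throughput delivered per percentage point of server CPU used."""
    return throughput / cpu_percent


def rank_cards(results):
    """Return (card, merit) pairs sorted from best to worst."""
    merits = [
        (card, round(figure_of_merit(tput, cpu), 2))
        for card, (tput, cpu) in results.items()
    ]
    return sorted(merits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for card, merit in rank_cards(SERVER_RESULTS):
        print(f"{card:20s} {merit:6.2f}")
```

Note that the ratio rewards low CPU utilization heavily: the NE3200 ranks second despite having nearly the lowest throughput in the set.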



CONCLUSION 

In conclusion, Ethernet cards should be evaluated in a
realistic environment as opposed to special cases that may
highlight certain aspects of the cards. As Ethernet has
become a mature technology, the throughput of most cards has
approached the bandwidth limit of the network. Ideally, the
end user should run these benchmarks in the target
environment to obtain unbiased results. More realistic tests
would include scripting of actual file transfers and
possibly measuring what effects the card may have on other
aspects of the machine, such as video performance. It should
be noted that a faster hard drive on the server provides the
most tangible performance increase to the end user.



Throughput/CPU Utilization 

FIGURE 6. Server Throughput/CPU Utilization



DATA

Figure 3

Card                    Avg. Throughput (Kbytes/sec)
----------------------  ----------------------------
3Com ETHERLINK III      731.57
ALLIED TELESIS AT1500   578.72
SMC ELITE               519.64
AT/LANTIC (SHARED)      514.77
NE2000                  501.46
INTEL ETHER XPRESS      492.00
AT/LANTIC (I/O)         489.39

Figure 4

Card                    Avg. Throughput (Kbytes/sec)
----------------------  ----------------------------
AT1500                  1021.69
3Com ETHERLINK III      1015.16
AT/LANTIC (SHARED)      1005.78
SMC ELITE               1003.27
XPRESS INTEL            999.31
AT/LANTIC (I/O)         998.25
NE2000                  996.66

Figures 5 and 6

Card                    Avg. Throughput  CPU              Throughput/
                        (Kbytes/sec)     Utilization (%)  CPU Utilization
----------------------  ---------------  ---------------  ---------------
PLX-SONIC               992.76           11.5             86.33
NE3200                  916.75           11.0             83.34
MYLEX (EISA)            981.49           16.0             61.34
SONIC-AT                986.15           17.0             58.01
SMC ELITE               953.49           35.5             26.86
3Com ETHERLINK III      960.81           37.0             25.97
AT/LANTIC (SHARED)      966.12           38.6             25.03
HP ETHERTWIST           946.59           42.5             22.27
XPRESS INTEL            932.33           50.0             18.65
NE2000                  930.91           70.0             13.30
AT/LANTIC (I/O)         907.68           71.6             12.68


LIFE SUPPORT POLICY

NATIONAL'S PRODUCTS ARE NOT AUTHORIZED FOR USE AS CRITICAL COMPONENTS IN LIFE SUPPORT
DEVICES OR SYSTEMS WITHOUT THE EXPRESS WRITTEN APPROVAL OF THE PRESIDENT OF NATIONAL
SEMICONDUCTOR CORPORATION. As used herein:

1. Life support devices or systems are devices or systems
   which, (a) are intended for surgical implant into the
   body, or (b) support or sustain life, and whose failure
   to perform, when properly used in accordance with
   instructions for use provided in the labeling, can be
   reasonably expected to result in a significant injury to
   the user.

2. A critical component is any component of a life support
   device or system whose failure to perform can be
   reasonably expected to cause the failure of the life
   support device or system, or to affect its safety or
   effectiveness.